Abstract

Background: Enteroaggregative Haemorrhagic E. coli (EAHEC) is a new pathogenic group of E. coli characterized by
the presence of a vtx2-phage integrated in the genomic backbone of Enteroaggregative E. coli (EAggEC). So far,
four distinct EAHEC serotypes have been described that caused, besides the large outbreak of infection that occurred in
Germany in 2011, a small outbreak and six sporadic cases of HUS in the time span 1992-2012. In the present work
we determined the whole genome sequence of the vtx2-phage, termed Phi-191, present in the first described
EAHEC O111:H2, isolated in France in 1992, and compared it with those of the vtx-phages whose sequences were
available.

Results: The whole genome sequence of the Phi-191 phage was identical to that of the vtx2-phage P13374 present
in the EAHEC O104:H4 strain isolated during the German outbreak 20 years later. Moreover, it was also almost
identical to those of the other vtx2-phages of EAHEC O104:H4 strains described so far. Conversely, the Phi-191 phage
appeared to be different from the vtx2-phage carried by the EAHEC O111:H21 isolated in Northern Ireland in 2012.
The comparison of the vtx2-phage sequences from EAHEC strains with those of the vtx-phages from typical
Verocytotoxin-producing E. coli strains showed the presence of a 900 bp sequence uniquely associated with
EAHEC phages and encoding a tail fiber.

Conclusions: At least two different vtx2-phages, both characterized by the presence of a peculiar tail fiber-coding
gene, intervened in the emergence of EAHEC. The finding of an identical vtx2-phage in two EAggEC strains isolated
20 years apart, in spite of the high variability described for vtx-phages, is unexpected and suggests that such vtx2-phages
are kept under a strong selective pressure.

The observation that different EAHEC infections have been traced back to countries where EAggEC infections are
endemic and the treatment of human sewage is often ineffective suggests that such countries may represent the
cradle for the emergence of the EAHEC pathotype. In these regions, EAggEC of human origin can extensively
contaminate the environment, where they can meet free vtx-phages likely spread by ruminant excreta.
Background

Diarrheagenic Escherichia coli (DEC) are a heterogeneous 
group of pathogenic E. coli causing a wide range of enteric 
diseases in humans and animals [1]. 

Enteroaggregative E. coli (EAggEC) are a DEC pathotype 
inducing a gastrointestinal illness characterized by long- 
lasting watery, mucoid, secretory diarrhoea with low-grade 
fever and little or no vomiting [2,3]. EAggEC infections are 
a common cause of acute diarrheal illness among children 
in low-income countries, but sporadic cases and outbreaks 
are recorded in industrialized countries as well [4,5]. 

In 2011, a large E. coli outbreak struck Germany causing
more than 4,000 human infections including 50 deaths
[6]. The outbreak strain, an E. coli O104:H4, showed the
presence of the typical virulence genes of EAggEC such as
aggR, aaiC, sepA and aatA and, at the same time, carried a
bacteriophage conveying the genes encoding the
Verocytotoxin (Vtx) subtype 2a (vtx2a) [7]. In accordance with
this genomic makeup, the strain showed the typical enteroaggregative
"stacked brick" adhesion to cultured HEp-2 cells and
was able to produce Vtx2 [8]. The infecting strain thus
displayed an unusual combination of virulence features,
comprising the colonization repertoire of EAggEC
coupled with the production of a toxin typically produced
by Vtx-producing E. coli (VTEC), a DEC type causing
haemorrhagic colitis and Haemolytic Uremic Syndrome
(HUS) worldwide [1].

The impact of the German outbreak was so great that
the epidemic strain became iconic of a new DEC type:
the Enteroaggregative Haemorrhagic E. coli (EAHEC)
[9]. The occurrence of the German outbreak also caused
the scientific community to look retrospectively at the
reported HUS cases linked to infections with atypical
VTEC types and to browse the scientific literature in order
to assess whether other EAHEC cases of infection could be
retrieved. It turned out that in the time period 1992-2012 a
small outbreak and at least six sporadic cases of HUS had
been described as being associated with EAHEC strains
belonging to four different serotypes: O111:H2, O86:HNM,
O104:H4 and O111:H21 [8,10-13].

The analysis of the whole genome sequence of the
EAHEC O104:H4 that caused the German outbreak in
2011 showed that the vtx2-phage is inserted in a bacterial
genomic backbone typical of EAggEC [14]; therefore, the
EAHEC pathotype seems to have arisen from the
acquisition of vtx2-phages by classical EAggEC strains.

The appearance of the EAHEC group has shown that
the stable acquisition of vtx-phages seems to have occurred
at least twice, in two different DEC groups: the EAggEC
and the atypical EPEC (aEPEC), from which the typical
VTEC pathotype derives [15-17]. Moreover, the ability
of vtx2-phages to infect, under laboratory conditions,
different E. coli pathogroups including ExPEC has been
reported [18,19]. This observation, together with the
isolation of Vtx-producing Enterobacteriaceae other than E. coli
from cases of human disease [20,21], suggests that
vtx-phages can infect a range of bacterial hosts wider
than expected, confirming the pivotal role of phages in
the evolution of bacterial pathogens.

In the present work we determined the whole genome
sequence of the vtx2-phage present in the first EAHEC
ever described and compared it with those of the
vtx2-phages present in the EAHEC O104:H4 and O111:H21
available in the public repositories and with those of other
vtx-phages, with the aim of investigating the mechanisms
underlying the evolution of the EAHEC pathotype.

Methods 

Bacterial strains 

The EAHEC O111:H2 strain ED 191 has been used to
obtain the vtx2-phage subjected to whole genome
sequencing and is part of the collections held at Istituto Superiore
di Sanità. The characteristics of the strain have been described
in a previous publication [10].

E. coli K12 strain LE392 [22] has been used as a
propagator strain in infection experiments for the amplification
of the vtx2-phage prior to sequencing.

Determination of the vtx2-phages integration sites in the 
E. coli genome 

The vtx2-phage integration site in the E. coli strain ED
191 has been determined. The occupancy of loci sbcB,
wrbA, yehV, Z2577, and yecE has been assessed as
previously described [18].

Infection experiments and phage propagation

The EAHEC strain ED 191 has been exposed to UV
light in order to induce the excision of the phage genome
from the bacterial chromosome [23]. In detail, the
bacterial strain has been grown in Luria-Bertani (LB) broth
(Oxoid Limited, Basingstoke, Hampshire, UK) overnight
at 37°C with vigorous shaking. The culture has been
diluted 1:100 in LB modified broth (LB with 0.001% v/v
thiamine) and grown to an OD600 of 0.5, pelleted and
re-suspended in a sterile solution of 10 mM CaCl2. The
culture has been exposed to UV light (130 µjoule × 100)
in a Stratalinker® UV crosslinker (Stratagene
Cloning Systems, La Jolla, CA, USA). After induction, the
culture has been diluted in LB modified broth and
incubated at 37°C for 5 hours with vigorous shaking. The
culture has been centrifuged and the supernatant containing
phage particles filtered through 0.22 µm pore-size filters. 100 µl
of the phage particle suspension have been added to 100 µl
of a culture of the propagator strain E. coli LE392 grown
in LB modified broth to an OD600 of 0.5 and maintained at
37°C for 20 minutes with static incubation. To each tube,
3.5 ml of LB modified soft agar (LB
modified broth with 7 g/L agar) at 42°C have been added, and the
mixture has been immediately poured onto LB modified agar plates
(LB modified broth with 15 g/L agar). Plates have been incubated
overnight at 37°C.

Four ml of SM buffer (100 mM NaCl, 8 mM MgSO4·7H2O,
50 mM Tris-HCl pH 7.5, 0.002% gelatin) have been
dispensed onto each plate in order to recover phage
particles from the lytic plaques, and the plates have been kept
overnight at 4°C. The phage suspension in SM has been recovered and
chloroform has been added at 5% final concentration. The phage
suspension has been centrifuged twice at 500 × g for 10 minutes
to remove agar debris and used to re-infect the
propagator E. coli strain LE392 in the conditions described
above in order to increase the phage titre. Finally, the
phage suspension has been concentrated by using Amicon
Ultra-15 Centrifugal Filter Units with Ultracel-30 membranes
(Merck Millipore, Billerica, MA, USA) with a cut-off of
30 kDa. The final phage titre was 7 × 10^10 PFU/ml.

CsCl gradient and viral DNA extraction

The suspension has been purified by isopycnic
centrifugation through a CsCl equilibrium gradient as described by
Sambrook and Russell [23]. Briefly, 1.5 g of CsCl have been added
to 2 ml of the phage suspension, and the mixture has been transferred to
ultracentrifuge tubes, which have been filled with a 0.75 g/ml CsCl
solution. The tubes have finally been sealed with
mineral oil and centrifuged in a Beckman ultracentrifuge
at 154,000 × g, 8°C, for 20 hours in a SW-41 rotor. The band
containing the phage particles has been collected with a
syringe by puncturing the tube. The recovered solution
containing the purified phage particles has been dialyzed
against 10 mM NaCl, 50 mM Tris-HCl pH 8.0, 10 mM
MgCl2. A final volume of 1 ml was obtained.

The suspension has been treated with 100 units
of RNase-free DNase I (New England Biolabs, USA) at
37°C for one hour to eliminate free DNA contaminating
the phage suspension. Finally, a treatment with 50 µg/ml
proteinase K at 56°C for one hour has been carried
out to disrupt the phage capsid, followed by DNA
extraction with phenol-chloroform-isoamyl alcohol [23]. The phage
DNA concentration after the purification step was
estimated to be 239.4 ng/µl.

Library preparation and whole genome sequencing of the 
phage DNA 

Phage DNA has been sequenced with an Ion Torrent
PGM semiconductor sequencer (Life Technologies, Carlsbad,
USA) using the 200 bp protocol. An Ion Torrent 314 chip
has been used following the manufacturer's instructions
(Life Technologies, Carlsbad, USA). The genomic library has
been obtained by shearing 1 µg of DNA into blunt-ended
fragments followed by ligation of the Ion Adapters using the
protocol included in the Ion Xpress™ Plus Fragment
Library Kit (Life Technologies, Carlsbad, USA). The size-selected
and ligated fragments have been amplified by emulsion-PCR
using the Ion OneTouch 200 Template kit and
instruments (Life Technologies, Carlsbad, USA).

Assembly and further bioinformatics analyses 

The reads resulting from the sequencing of the
vtx2-phage DNA from the EAHEC O111:H2 strain, termed
Phi-191, have been assembled into contigs by using the
open source MIRA software integrated in the Ion Torrent
Server. Contigs have been imported into Kodon software
(Applied Maths NV, Sint-Martens-Latem, BE) for analysis.
To fill in the gaps between contigs, a total of 95 primers
have been designed and used for sequencing by Sanger
technology using a Genetic Analyzer 3130 (Life
Technologies, Carlsbad, USA). Mauve software [24] has been
used to order the contigs using the sequence of the
phage P13374 from the E. coli O104:H4 that caused the
outbreak in Germany in 2011 [GenBank: NC_018846.1]
as reference. The complete sequence of the Phi-191
phage has been annotated with the Prokka tool on the online
server Galaxy/CRS4 [25] and submitted to GenBank
[GenBank: KF971864]. The G + C content has been
analysed with the GC calculator free online tool [26].
Identification of putative tRNA genes has been performed
using tRNAscan-SE [27].
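The G + C content mentioned above is a simple base-composition ratio. The following is a minimal Python sketch of that calculation, not the online tool cited as [26]; the function name `gc_content` and the choice to ignore ambiguous bases such as N are assumptions made for illustration.

```python
def gc_content(sequence: str) -> float:
    """Return the G + C percentage of a nucleotide sequence.

    Assumption: only unambiguous bases (A, C, G, T) enter the
    denominator, so ambiguity codes such as N are ignored.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / total


if __name__ == "__main__":
    # Toy sequence for illustration only, not taken from Phi-191.
    print(gc_content("ATGCGCNNTA"))
```

Applied to the full Phi-191 sequence, a calculation of this kind would be expected to give a value comparable to the one reported in the Results, provided ambiguous bases are handled the same way as in the tool actually used.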

The raw sequence data (short reads) from the EAHEC
O111:H21 strain 226 were retrieved from the SRA
database present on the NCBI website [NCBI SRA: SRA055981]
and aligned to the complete sequence of the Phi-191 phage,
determined in the present study and used as reference,
with the Bowtie2 free software implemented in the
Galaxy/CRS4 server [25].

Genomic comparisons between the available vtx-phage
sequences have been performed by using the BLAST
algorithm available at NCBI [28] and the Mauve free
alignment software [24]. The comparison map between
vtx-phages has been generated with the Circoletto online tool
[29,30]. For the pictogram construction, bit-score values
have been used to describe the quality of the alignment
at a given point. The bit-score is a normalized version of
the score value returned by the BLAST searches, expressed
in bits [28].
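The normalization behind the bit-score can be written out explicitly. The sketch below uses the standard Karlin-Altschul relation S' = (λS − ln K) / ln 2, where S is the raw score and λ and K are statistical parameters of the scoring system; the function names are hypothetical, and the parameter values in the usage example are illustrative, not those of the searches performed in this work.

```python
import math


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Convert a raw alignment score S into a bit-score S'.

    Uses S' = (lam * S - ln K) / ln 2, with lam and K being the
    Karlin-Altschul parameters of the scoring system in use.
    """
    if lam <= 0 or k <= 0:
        raise ValueError("lam and k must be positive")
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(bits: float, query_len: int, db_len: int) -> float:
    """Expected number of chance hits with at least this bit-score.

    Uses E = m * n * 2**(-S'), where m and n are the query and
    database lengths.
    """
    return query_len * db_len * 2.0 ** (-bits)


if __name__ == "__main__":
    # Illustrative parameters only (assumed, not from the paper).
    print(bit_score(raw_score=200, lam=1.28, k=0.46))
```

Because the bit-score no longer depends on the scoring system, values from different searches can be compared directly, which is what makes it suitable for colour-coding aligned segments in a comparison map.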

Ethics 

This work does not include animal testing and does not
report human data. All information regarding
outbreaks and cases of infection is taken from already
published papers properly referenced in the text.

Results 

Sequencing of the vtx2-phage from the EAHEC O111:H2

The whole genome sequencing of the vtx2-phage from
the EAHEC O111:H2 strain, Phi-191, produced 320,044
reads of a mean length of 204 bp, for a total of 65.30 Mb
sequenced. The assembly of the Phi-191 sequence reads
using as a reference the genome of the P13374 phage
[GenBank: NC_018846.1] produced 151 contigs that were
further analysed using bioinformatics resources. The
estimated coverage of the phage genome was 1088X.

The whole sequence of Phi-191 was
about 61 kb long (61,036 bp) [GenBank: KF971864], with
a mean G + C content of 50.2%.
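As a rough cross-check of the sequencing figures, mean depth can be approximated as total sequenced bases divided by genome length. The sketch below uses the read count, mean read length and genome length reported in this section; the function name `mean_depth` is hypothetical, and the simple ratio is an assumption, since the text does not specify how the reported coverage was estimated.

```python
def mean_depth(total_bases: int, genome_length: int) -> float:
    """Approximate mean sequencing depth as total bases / genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return total_bases / genome_length


# Figures reported in the text for the Phi-191 sequencing run.
READS = 320_044
MEAN_READ_LENGTH = 204   # bp
GENOME_LENGTH = 61_036   # bp

TOTAL_BASES = READS * MEAN_READ_LENGTH  # about 65.3 Mb

if __name__ == "__main__":
    print(f"total bases: {TOTAL_BASES:,}")
    print(f"approximate depth: {mean_depth(TOTAL_BASES, GENOME_LENGTH):.0f}X")
```

This back-of-the-envelope ratio gives a depth of roughly 1,070X, of the same order as the reported 1088X; the small difference presumably reflects the method used by the sequencing pipeline, which is not described here.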

Sequence annotation revealed the presence of 87
predicted coding sequences, including the genes encoding
Verocytotoxin subtype 2a, and three transfer RNAs
(tRNAs) (Figure 1).

As has been reported for the P13374 phage,
conveying the vtx2 genes in the EAHEC O104:H4 strain that
caused the German outbreak in 2011 [31], the genome
of Phi-191 (Figure 1) only included the lambda genes cI
and cro, while the other genes typically composing the
regulatory repertoire of lambda phages, such as cII, cIII,
N, Ea10 and gam, seemed not to be present [31]. Most
of these genes are involved in the regulation of the
switch between lysogeny and lytic cycle in lambda-like
phages. Therefore, their absence suggests the existence
of an alternative mechanism used to regulate the choice
of entry into the lysogenic state. The Phi-191 phage is
integrated in the bacterial chromosome at the wrbA locus,
as has been described for P13374 [31].

Comparison of the Phi-191 whole genome sequence with 
other vtx-phage sequences 

A BLAST search of the Phi-191 vtx2-phage whole
genome sequence against those collected in the nucleotide
repository held at NCBI returned a 99% identity with
that of the P13374 phage from O104:H4 strain CB13374,
which caused the German outbreak in 2011 [GenBank:
NC_018846.1], and a 91% identity with that of the
vtx2-phage TL-2011c carried by the VTEC O103:H25 strain
that caused a severe HUS outbreak in Norway in 2006
[GenBank: JQ011318] [32]. This finding is in agreement
with what has been reported in the literature for the vtx-phage
carried by the EAHEC O104:H4 from the German outbreak
and the TL-2011c phage [33]. As expected, the Phi-191







Figure 1 Genomic organization of Phi-191 phage. Coding sequences are represented as blue bars in the outer circle. Putative genes are
labelled according to the predicted functions of their products, if known. The whole GC content and the GC skew of the leading and lagging strands are
shown in the black and coloured inner circles, respectively.
phage sequence also showed 99% identity with the
vtx2-phage from another EAHEC O104:H4 strain isolated
during the 2011 German outbreak, the strain 2011C-3493
[GenBank: CP0032891.1], and was highly related (97%
nucleotide sequence identity) to the sequences of two
vtx2-phages from two EAHEC O104:H4 strains isolated
from two haemorrhagic colitis cases that occurred in
Georgia in 2009 [GenBank: CP003301.1, CP003297.1] [34].

The remaining hits had query coverage
values ranging from 87% down to 60%, with 98%-99%
sequence identity, and corresponded to a number of other vtx-phages
identified in different VTEC strains (Table 1). The
alignment of all the phage sequences comprised in the
100%-60% similarity range is shown in a pictogram generated
with the Circoletto online tool [30] (Figure 2) and in another
one produced by Mauve software (Additional file 1:
Figure S1).
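Query coverage, used above as the similarity measure, is the fraction of the query sequence spanned by at least one aligned segment. The following is a minimal sketch of how it could be computed from the query coordinates of BLAST high-scoring segment pairs; the function name `query_coverage` and the input format are assumptions for illustration and do not reproduce the NCBI implementation.

```python
from typing import Iterable, Tuple


def query_coverage(hsps: Iterable[Tuple[int, int]], query_length: int) -> float:
    """Percentage of the query covered by at least one aligned segment.

    hsps holds (start, end) query coordinates, 1-based and inclusive.
    Overlapping segments are merged so that no base is counted twice;
    reversed coordinates (start > end) are normalized first.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted((min(s, e), max(s, e)) for s, e in hsps):
        if cur_end is None or start > cur_end:
            # Close the previous merged interval and open a new one.
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlap: extend the current merged interval.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return 100.0 * covered / query_length


if __name__ == "__main__":
    # Hypothetical segments on a 1,000 bp query.
    print(query_coverage([(1, 100), (50, 150), (400, 300)], 1000))
```

Merging overlapping segments before summing is the key design choice: without it, a region hit by several segments would inflate the coverage value.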

Recently, another EAHEC strain of serotype O111:H21
(strain 226), isolated from an HUS case that occurred in Northern
Ireland in 2012, has been described [13]. The short reads
from the whole genome sequencing project of this strain
are available in the NCBI Sequence Read Archive [NCBI SRA:
SRA055981] and have been used for comparison.
Unfortunately, among the 456 contigs obtained from the de
novo assembly of the reads, the one including the vtx2
genes was only 8,042 bp long, hindering the analysis of
the entire sequence of the vtx2-encoding phage.
Nevertheless, its whole genome sequencing reads were aligned to
the complete genomic sequence of the Phi-191 phage.
The alignment was carried out with Bowtie2 software and
visualized with the Integrative Genomics Viewer (IGV) free
tool [45] (data not shown). This analysis failed to identify,



Table 1 List of the BLAST hits of Phi-191 DNA sequence aligned to the vtx-phage sequences from typical VTEC strains

Strain                            Similarity%   Acc. No.      Reference
E. coli O145:H28 str. RM13514     87%           CP006027.1    [35]
E. coli O103:H2 str. 12009 DNA    86%           AP010958.1    [35,36]
Phage VT2 phi_272                 85%           HQ424691.1
E. coli O157:H7 str. TW14359      67%           CP001368.1    [37]
E. coli O157:H7 str. EC4115       67%           CP001164.1    [38]
E. coli O111:H- str. 11128        65%           AP010960.1    [36]
E. coli O157:H7 str. Sakai        65%           BA000007.2    [39]
E. coli Xuzhou21                  65%           CP001925.1    [40]
E. coli O157:H7 EDL933            65%           AE005174.2    [41]
Stx2 converting phage II          64%           AP005154.1    [42]
Stx2 converting phage I           63%           AP004402.1    [43]
Enterobacteria phage Min27        63%           EU311208.1    [44]
Stx1 converting phage             60%           AP005153.1    [42]

The similarity score indicates the query coverage values with 98%-99%
sequence identity. The hits with similarity values down to 60% are shown.



in the genome of strain 226, a vtx2-phage with
characteristics similar to those of Phi-191. In particular, the
alignment showed that the reads from strain 226 mapped on
the sequence of the Phi-191 phage only in the region
comprising the Vtx2 A subunit-coding gene and extending
4.2 kb downstream of the gene encoding the Vtx2 B subunit.
It should be noted that alignments conducted using the
whole short reads set could have generated excessive
background noise deriving from the presence of sequences
from other bacteriophages integrated in different regions
of the chromosome of the O111:H21 strain 226.
Nevertheless, the lack of sequencing reads aligning to the rest of
the sequence of Phi-191 suggests that a different
vtx2-phage intervened in the emergence of the latter EAHEC
strain.
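The observation that reads map only to a restricted portion of a reference can be made explicit by listing the contiguous reference intervals with non-zero depth. The sketch below assumes a per-base depth profile is already available (for instance exported from an alignment viewer or a depth-reporting tool); the function name `covered_regions` and the toy depth values are hypothetical and are not data from this study.

```python
from typing import List, Sequence, Tuple


def covered_regions(depth: Sequence[int], min_depth: int = 1) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) runs where depth >= min_depth."""
    regions = []
    run_start = None
    for pos, d in enumerate(depth, start=1):
        if d >= min_depth:
            if run_start is None:
                run_start = pos          # a covered run begins here
        elif run_start is not None:
            regions.append((run_start, pos - 1))  # the run ended at the previous base
            run_start = None
    if run_start is not None:
        regions.append((run_start, len(depth)))   # run extends to the last base
    return regions


if __name__ == "__main__":
    # Toy depth profile for illustration only.
    print(covered_regions([0, 0, 5, 7, 3, 0, 0, 2]))
```

Raising `min_depth` above 1 is one possible way to discount sparse background alignments of the kind discussed above, at the cost of missing genuinely low-coverage regions.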

Genomic comparison of Phi-191 with other vtx2-phages 

In order to delve into the genomic organization of the
different vtx2-phages we performed a progressive Mauve
alignment of all the phage sequences used in the
nucleotide comparison (Additional file 1: Figure S1). This
analysis confirmed the presence of similarities between
the sequences but also highlighted that extensive
rearrangements between the sequence blocks must have
occurred at some points during the phages' evolution.
This observation is in agreement with what has been reported in
the literature [46-48].

A deeper analysis of the Phi-191 genome was conducted
by using phage sequences selected among those from
typical VTEC displaying the highest similarity values (Table 1).

The alignment of the Phi-191 sequence with those of
phage TL-2011c (O103:H25) and of the phages from the
VTEC strains RM13514 (O145:H28) and 12009 (O103:H2),
the causative agent of a romaine lettuce-associated
outbreak that occurred in the US in 2010 [35] and a strain isolated in
Japan from a sporadic case of diarrhoea [36],
respectively, showed the presence of two sequence blocks that
seemed to be peculiar to the EAHEC vtx2-phage and
divergent from the same regions in the three classical
VTEC-associated phages (Figure 3). One of the two
regions was 1,500 bp long and comprised 140 bp of the
5'-terminus of a gene encoding a lysozyme (rrrD), two
complete genes encoding the lysis protein S and a
hypothetical protein, and 440 bp of the 3'-terminus of a gene
annotated as a hypothetical protein. The 1,500 bp region
corresponded to the nucleotide positions 5,240-6,725 in
the P13374 genome (ORFs 12-15). The other fragment
was 900 bp long and spanned the region comprised
between nucleotides 42,160 and 43,050 in the genome of
the P13374 phage. The 900 bp fragment shared 100%
identity with a region of 730 bp at the 5'-terminus of
the ORF65 of P13374 phage, encoding a phage tail fiber,
and with a 72 bp fragment of the ORF66, coding for a tail
fiber adhesin.
Figure 2 Sequence similarities between Phi-191 and other vtx-phages. The picture shows the results of the BLAST local alignments using
Phi-191 as a query against the vtx-phage sequences with 99% to 60% similarity listed in Table 1. The colour codes blue, green, orange and red
represent the overall quality of the aligned segments along the phage sequences, evaluated on the basis of the bit-score values in worst-to-best
order (blue to red). The bit-score is a normalized version of the score value returned by the BLAST searches, expressed in bits [28]. The height of the
coloured bars in the histogram on top of the Phi-191 ideogram shows how many times each colour hits a specific fragment of the other phage
sequences. A twist in a ribbon indicates that the local alignment is inverted (query and database sequence on opposite strands).



Distribution and analysis of the putative EAHEC-associated
sequence regions amongst the vtx-phages

A BLAST comparison was conducted with the aim of
investigating the distribution of the two putative
EAHEC-associated regions among all the vtx-phage sequences
available at NCBI. Such an analysis showed that only the
900 bp region encoding the phage tail fiber was present in
all the fully sequenced vtx2-phages from EAHEC strains
[GenBank: CP0032891.1; CP003301.1; CP003297.1] and
divergent or absent in all the other phage sequences
investigated. Accordingly, only the presence of the
900 bp sequence was assessed in the contigs derived
from the de novo assembly of the short reads from the
genome sequence of the O111:H21 strain 226 from
Northern Ireland.

Discussion 

The E. coli genome continuously changes through both
small-scale variations and horizontal transfer of
mobile genetic elements (MGE). Among MGEs,
bacteriophages play a pivotal role in the evolution of E. coli
pathogenic clones [49] by providing a means for genomic
remodelling and conveying important virulence genes
such as those encoding the Verocytotoxins (vtx1 and
vtx2) in the Verocytotoxin-producing E. coli (VTEC).

In 2011 a huge outbreak caused by an Enteroaggregative
Haemorrhagic E. coli (EAHEC) O104:H4 struck Germany
with more than 3,000 cases of infection, 800 HUS, and 50
deaths [12]. The causative agent was a mosaic strain
deriving from the lysogenization of an Enteroaggregative E. coli
strain with a vtx2-phage [8]. Such a virulence combination
was indeed associated with elevated pathogenicity, as
demonstrated by the high rate of human infections evolving to
HUS, even among adults (88%, with a median age of 42 years),
and the heavy toll of 50 deaths [6].

This arrangement of virulence factors in E. coli strains
from human disease had been occasionally reported
before, during the investigation of a small outbreak of HUS
Figure 3 Mauve Progressive Alignment of vtx2-phage genomes from EAHEC and VTEC strains showing the EAHEC-specific regions.

Blocks with the same colours indicate the vtx2-phage regions with identical DNA sequence. White blocks in a phage sequence indicate regions
lacking correspondence in the other sequences. Fragments of the phage genomes that are peculiar to EAHEC vtx2-phages are marked with
circles: a red circle indicates the sequence encoding a hypothetical protein; the sequence encoding the phage tail fiber is encircled in blue.



that occurred in France in 1992 and of a case of infection in
Japan in 1999 [10,11].

The occurrence of the German outbreak caused the
scientific community to look back at the
repositories of VTEC infection records and culture
collections, and it turned out that some other sporadic cases
of infection with Vtx2-producing Enteroaggregative E.
coli O104:H4 had already occurred in Europe and Asia
in the time span 2001-2011 [12,31]. Finally, an HUS case
that occurred in Northern Ireland in 2012 was demonstrated
to be associated with an EAHEC O111:H21 [13].

The observation of sporadic cases and outbreaks
occurring throughout a 20-year time span, all caused by
Vtx2-producing EAggEC belonging to different
serotypes, strengthens the hypothesis that these
pathogenic E. coli represent a new pathogroup, as has been
recently proposed [9].

To better understand the events underlying the
emergence of EAHEC we determined the whole genome
sequence of Phi-191, the vtx2-phage present in the EAHEC
O111:H2 isolated during the French outbreak of 1992,
and compared it with the sequences of the vtx2-phages
present in the EAHEC strains described in the following
years and available in GenBank.

Interestingly, the genomic sequence of Phi-191 was
almost identical to that of the vtx2-phages from the EAHEC
O104:H4 strains isolated during the 2011 German outbreak
about 20 years later. This is noteworthy since vtx-phages
are characterized by a high degree of variability [50,51]. It
is conceivable that the same vtx2-phage has been acquired
in two different events and that selective pressure
impeded the accumulation of changes in the phage sequence
before the phage infection events occurred.

However, the EAHEC O111:H21 isolated in Northern
Ireland in 2012 seems to host a different type of
vtx2-phage, suggesting that at least two different vtx2-phage
types have been successfully transferred to EAggEC.
Unfortunately, the sequence of the phage of the EAHEC
O86:HNM isolated in Japan in 1999 was not available
for comparison [11].

It has been hypothesized that the infection with a
lambdoid phage can be mediated by cross-talk
between the bacterium and the phage, resulting in host
specificity [52]. An extended comparison of the EAHEC
vtx2-phages with the whole genome sequences of
vtx-phages from VTEC strains available at NCBI returned a
wide range of similarities between sequences, going
from 87% to 60% and lower. This picture is in line with
what has been reported for the general variability of vtx-phage
sequences [50]. Interestingly, one region of 900 bp,
identified in Phi-191 and encoding a tail fiber, was
present in all the vtx2-phages from EAHEC and was
also present in the short reads dataset from the EAHEC
O111:H21. At the same time, this DNA sequence was
absent in all the vtx-phage sequences identified in
VTEC strains and stored at NCBI.

This is in agreement with previously reported data
which pointed at a larger fragment including this
region as one of those differentiating the P13374 genome
from the E. coli phage TL-2011c (O103:H25) and not
exhibiting significant homology to known vtx-encoding
phages [31].

The annotation of the Phi-191 genome showed that
this sequence peculiar to EAHEC vtx2-phages contains
part of a gene encoding a type of phage tail fiber
displaying some conserved amino acid motifs such as a
Collagen triple helix repeat (20 copies) [NCBI CDD:189968]
and the Peptidase_S74 [NCBI CDD:258151]. The latter
is the C-terminal domain of the bacteriophage protein
endosialidase, which forms homotrimeric molecules and
releases itself from the end-tail-spike of the
bacteriophages [53].

The 900 bp-long sequence could potentially encode
part of the mechanism defining the specificity of the
vtx2-phages for EAggEC strains, being directly involved
in the phage-bacterium interaction. As a matter of fact,
several authors reported that the interactions between
phage tail fibers and host proteins, such as LamB and
OmpC [52,53], contribute to the success of the infection,
as demonstrated by the finding that lamB gene mutations
block phage adsorption [52]. It is therefore conceivable
that differences in phage tail fibers may contribute to
defining the tropism of vtx2-phages for E. coli recipients, although
this hypothesis, together with the mechanisms underlying
this process, still needs to be verified.

For a successful infection to occur, suitable vtx-phages
and E. coli acceptors need to meet in the same
environment. In the case of the emergence of typical VTEC, the
events of vtx-phage acquisition probably occurred at the
level of the gastrointestinal tract of ruminants [54], where
both vtx-phages and aEPEC are abundant [55,56].

Conversely, the EAHEC emergence is probably not
directly connected to an animal reservoir, since EAggEC
are human pathogens with an inter-human transmission
of the infection [1]. The environment, in turn, plays a role
in the pathogens' amplification cycles, particularly in
geographic areas characterised by poor hygienic conditions,
where the lack of effective human sewage treatment
makes infections with enteric pathogens, including
EAggEC, endemic [57]. In such a scenario, an
environment contaminated with ruminants' excreta might have
been the source of the vtx2-phages found in EAHEC, as
has been recently proposed [16]. Such a picture may
account for the existence of a favourable setting for the
EAggEC and the vtx-phages to come into contact and for
the following selection process resulting in the occasional
emergence of an E. coli strain matching the EAHEC
definition.



Conclusions 

The vtx2-phages characterising EAHEC seem to belong
to a sub-population of vtx-phages kept under selective
pressure and characterised by the presence of a gene
encoding a tail fiber, which could be involved in the
mechanism used to recognize the EAggEC. The new EAHEC
pathogroup may have emerged following multiple
vtx2-phage acquisition events favoured by the overlapping of a
human reservoir of pathogenic E. coli, the EAggEC, with
the known animal reservoir of vtx-phages. The
emergence of this new E. coli pathogroup further witnesses
the great adaptability and plasticity of this bacterial species
and underlines the need to rethink the global organization of
hygienic practices to mitigate enteric infections worldwide,
particularly in the presence of a global market of
foodstuffs that is extending its boundaries towards low-income
countries in the quest for new sources to meet the ever-increasing
demand for cheap and exotic food commodities.
Additional file 1: Figure S1. Mauve Progressive Alignment of Phi-191
genome with vtx2-phage genomes from EAHEC and VTEC strains showing
the highest score of similarity. Blocks with the same colours indicate the
vtx2-phage regions with identical DNA sequence. White fragments in a
phage sequence indicate regions lacking correspondence in the
other sequences. Connecting lines link the same genomic block in
different genomes and help to pinpoint the rearrangements between
vtx-phage genomes.